Keyphrase Extraction and Grouping Based on Association Rules
Authors
Abstract
Keyphrases capture the content of a document and are therefore useful for many natural language processing tasks, such as Information Retrieval, Document Classification, and Text Summarization. Keyphrase extraction aims to identify multi-word sequences in a collection of documents that more or less correspond to keyphrases. In this paper, we propose a new method for keyphrase extraction based on association rule mining. Redundant multi-word sequences and synonymous phrases inevitably make up a large part of the extracted keyphrases. With association rules, we can also reduce this redundancy by grouping related keyphrases that co-occur frequently. We further apply our keyphrase extraction and grouping solution to Information Retrieval. By both distinguishing and grouping keyphrases, we achieve improved Information Retrieval performance.
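The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of grouping phrases with association rules, not the authors' method. It assumes each document has already been reduced to a set of candidate phrases, mines pairwise rules using the standard support and confidence measures, and merges phrases linked by a rule into one group. The function names (`mine_pair_rules`, `group_phrases`), the thresholds, and the sample data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_pair_rules(docs, min_support=0.5, min_confidence=0.8):
    """Mine pairwise association rules x -> y from per-document phrase sets.

    support(x, y)     = fraction of documents containing both x and y
    confidence(x->y)  = count(x and y) / count(x)
    Thresholds are illustrative assumptions, not values from the paper.
    """
    n = len(docs)
    item_counts = Counter()
    pair_counts = Counter()
    for phrases in docs:
        item_counts.update(phrases)
        # Sort so each unordered pair is counted under one canonical key.
        pair_counts.update(combinations(sorted(phrases), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return rules


def group_phrases(docs, min_support=0.5, min_confidence=0.8):
    """Group phrases connected by at least one rule (simple union-find)."""
    parent = {p: p for phrases in docs for p in phrases}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for x, y, _, _ in mine_pair_rules(docs, min_support, min_confidence):
        parent[find(x)] = find(y)

    groups = {}
    for p in parent:
        groups.setdefault(find(p), []).append(p)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical candidate phrases for four documents.
docs = [
    {"association rule", "association rule mining", "information retrieval"},
    {"association rule", "association rule mining"},
    {"association rule", "association rule mining", "text summarization"},
    {"information retrieval", "text summarization"},
]
groups = group_phrases(docs)
```

With this sample data, "association rule" and "association rule mining" co-occur in three of four documents and always appear together, so they end up in one group, while the other two phrases remain singletons. A retrieval system could then treat each group as a single index term, which is one plausible reading of how grouping reduces redundancy.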
Similar resources
A new text-mining method for extracting user context information to improve the ranking of search engine results
Today, the importance of text processing and its uses is well known among researchers and students. The amount of textual and documentary material increases day by day, so we need useful ways to store it and retrieve information from it. For example, search engines such as Google, Yahoo, and Bing need to read so many web documents and retrieve the most similar ones to the user ...
KPCatcher - a keyphrase extraction system for enterprise videos
This paper introduces KPCatcher (keyphrase catcher). The value of our work lies in providing concrete solutions to building a real keyphrase extraction product for enterprise videos. KPCatcher has been designed to robustly extract a ranked list of keyphrases from enterprise videos, independent of the domain. It treats noun phrases in the transcript as candidate keyphrases and scores them by agg...
Coherent Keyphrase Extraction via Web Mining
Keyphrases are useful for a variety of purposes, including summarizing, indexing, labeling, categorizing, clustering, highlighting, browsing, and searching. The task of automatic keyphrase extraction is to select keyphrases from within the text of a given document. Automatic keyphrase extraction makes it feasible to generate keyphrases for the huge number of documents that do not have manually ...
WikiRank: Improving Keyphrase Extraction Based on Background Knowledge
A keyphrase is an efficient representation of the main idea of a document. While background knowledge can provide valuable information about documents, it is rarely incorporated in keyphrase extraction methods. In this paper, we propose WikiRank, an unsupervised method for keyphrase extraction based on the background knowledge from Wikipedia. Firstly, we construct a semantic graph for the docum...
Approximate Matching for Evaluating Keyphrase Extraction
We propose a new evaluation strategy for keyphrase extraction based on approximate keyphrase matching. It corresponds well with human judgments and is better suited to assess the performance of keyphrase extraction approaches. Additionally, we propose a generalized framework for comprehensive analysis of keyphrase extraction that subsumes most existing approaches, which allows for fair testing ...